Overview

Brought to you by YData

Dataset statistics

Number of variables: 1
Number of observations: 7516765
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 35441
Duplicate rows (%): 0.5%
Total size in memory: 400.1 MiB
Average record size in memory: 55.8 B
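
The percentage and per-record figures above follow from the raw counts by simple arithmetic. A minimal sketch that recomputes them from the numbers printed in this report; the variable names are illustrative, and the MiB conversion assumes 1 MiB = 1024² bytes:

```python
# Figures copied from the "Dataset statistics" block of this report.
n_rows = 7_516_765        # Number of observations
duplicate_rows = 35_441   # Duplicate rows
total_mib = 400.1         # Total size in memory (MiB)

# Share of duplicate rows, as a percentage of all observations.
duplicate_pct = 100 * duplicate_rows / n_rows

# Average bytes per record, assuming 1 MiB = 1024**2 bytes.
avg_record_bytes = total_mib * 1024**2 / n_rows

print(f"Duplicate rows (%): {duplicate_pct:.1f}%")        # 0.5%
print(f"Average record size: {avg_record_bytes:.1f} B")   # 55.8 B
```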

Variable types

Numeric: 1

Alerts

Dataset has 35441 (0.5%) duplicate rows [Duplicates]

Reproduction

Analysis started: 2024-12-19 02:21:29.437345
Analysis finished: 2024-12-19 02:26:22.753046
Duration: 4 minutes and 53.32 seconds
Software version: ydata-profiling v4.12.1
Download configuration: config.json

Variables

cod_taric
Real number (ℝ)

Distinct: 40373
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.4856517 × 10^9
Minimum: 1
Maximum: 9.9909902 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 57.3 MiB

Quantile statistics

Minimum: 1
5th percentile: 62
Q1: 9102
Median: 6031100
Q3: 6.0210909 × 10^8
95th percentile: 8.4818074 × 10^9
Maximum: 9.9909902 × 10^9
Range: 9.9909902 × 10^9
Interquartile range (IQR): 6.0209999 × 10^8

Descriptive statistics

Standard deviation: 2.8604631 × 10^9
Coefficient of variation (CV): 1.9253928
Kurtosis: 1.3962501
Mean: 1.4856517 × 10^9
Median Absolute Deviation (MAD): 6031084
Skewness: 1.720827
Sum: 1.1167294 × 10^16
Variance: 8.1822489 × 10^18
Monotonicity: Not monotonic
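
Several of the statistics above are derived from one another: CV is the standard deviation divided by the mean, variance is the squared standard deviation, the sum is the mean times the number of observations, and IQR is Q3 minus Q1. The sketch below checks that the printed figures are mutually consistent. It uses only numbers from this report; the relative tolerance of 1e-6 is an assumption chosen to absorb the rounding to eight significant figures.

```python
import math

# Inputs copied from this report.
n = 7_516_765      # Number of observations
mean = 1.4856517e9
std = 2.8604631e9
q1 = 9102
q3 = 6.0210909e8

# Quantities recomputed from the inputs above.
derived = {
    "cv": std / mean,       # coefficient of variation
    "variance": std ** 2,
    "sum": mean * n,
    "iqr": q3 - q1,
}

# The same quantities as printed in the report.
reported = {
    "cv": 1.9253928,
    "variance": 8.1822489e18,
    "sum": 1.1167294e16,
    "iqr": 6.0209999e8,
}

for name, value in derived.items():
    ok = math.isclose(value, reported[name], rel_tol=1e-6)
    print(f"{name}: derived={value:.7e} reported={reported[name]:.7e} match={ok}")
```

All four comparisons agree within the stated tolerance, so the small differences are rounding in the printed values rather than inconsistencies.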
Histogram with fixed size bins (bins=50)
Common values

Value                  Count      Frequency (%)
99                     38264      0.5%
84                     36480      0.5%
9990                   36090      0.5%
9990000000             30705      0.4%
999000                 30705      0.4%
99900000               30705      0.4%
85                     24515      0.3%
39                     21380      0.3%
61                     21125      0.3%
49                     20228      0.3%
Other values (40363)   7226568    96.1%

Minimum 10 values

Value   Count    Frequency (%)
1       1091     < 0.1%
2       2926     < 0.1%
3       14042    0.2%
4       1842     < 0.1%
5       1216     < 0.1%
6       12565    0.2%
7       7235     0.1%
8       10516    0.1%
9       7895     0.1%
10      2686     < 0.1%

Maximum 10 values

Value        Count    Frequency (%)
9990990200   2794     < 0.1%
9990990100   110      < 0.1%
9990940000   4275     0.1%
9990240000   94       < 0.1%
9990010000   456      < 0.1%
9990000000   30705    0.4%
9931990000   35       < 0.1%
9931270000   4        < 0.1%
9931240000   4        < 0.1%
9930990000   2195     < 0.1%

Interactions


Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Index   cod_taric
0       12
1       16
2       84
3       87
4       99
5       0603
6       6106
7       6114
8       8441
9       8531

Last rows

Index     cod_taric
7516755   7009920000
7516756   7117199990
7516757   7308400000
7516758   7607201000
7516759   8473309000
7516760   8536490099
7516761   8537109899
7516762   8538909999
7516763   9905000000
7516764   9990000000

Duplicate rows

Most frequently occurring

Index   cod_taric    # duplicates
35339   99           38264
24883   84           36480
35423   9990         36090
35424   999000       30705
35425   99900000     30705
35426   9990000000   30705
28656   85           24515
11616   39           21380
17912   61           21125
15151   49           20228
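
With a single column, a duplicated row is just a repeated value, so the table above amounts to a frequency count restricted to values that occur more than once. A minimal standard-library sketch of that idea follows. The toy list is hypothetical (the real cod_taric data is not included in this report), and this is not necessarily how ydata-profiling computes the table internally.

```python
from collections import Counter

# Hypothetical toy column standing in for cod_taric; not the real data.
cod_taric = ["99", "84", "99", "0603", "84", "99", "6106"]

# Count how often each value occurs.
counts = Counter(cod_taric)

# Keep only values seen more than once, most frequent first.
duplicates = [(value, n) for value, n in counts.most_common() if n > 1]

print(duplicates)  # [('99', 3), ('84', 2)]
```

The same reasoning explains why the duplicate counts in the table match the counts for the same values among the most frequent values of cod_taric.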